Setting up Fluentd on Amazon Linux: A step-by-step guide
Hello, my name is Aayush.
I recently discovered Fluentd, a service that collects log data and sends it to a variety of destinations.
In this post I will walk through the steps to set it up on Amazon Linux.
What is Fluentd?
Fluentd is a free and open-source data gathering and forwarding programme that can be used for log processing, metric collection, and other data aggregation operations. It is intended to be highly scalable and adaptable, capable of processing massive amounts of data from numerous sources and routing it to various destinations such as Elasticsearch, Hadoop, or Amazon S3. Fluentd supports over 500 plugins and can be modified to include custom plugins for various data sources and destinations. It is often used to gather and manage log data created by multiple containers and microservices in cloud-native systems such as Kubernetes and Docker.
Here are the steps to configure Fluentd on Amazon Linux:
Install Fluentd:
The first step is to download and install Fluentd on your computer or server. Fluentd is compatible with a number of operating systems, including Linux, macOS, and Windows. Fluentd can be installed via package managers such as yum, apt-get, or Homebrew, or it can be downloaded and installed manually from the Fluentd website.
Update the system packages:
$ sudo yum update
Install the required dependencies:
$ sudo yum install -y ruby-devel gcc gcc-c++ make
Install Fluentd using RubyGems:
$ sudo gem install fluentd
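Before moving on, it can help to confirm that the gem actually placed a fluentd executable on your PATH. The snippet below is a minimal sketch of that check; the variable name fluentd_status is only an illustrative placeholder.

```shell
# Confirm the gem put a fluentd executable on PATH and report its version.
# fluentd_status is a hypothetical helper variable, not part of Fluentd.
if command -v fluentd >/dev/null 2>&1; then
  fluentd_status="found"
  fluentd --version
else
  fluentd_status="missing"
  echo "fluentd not found on PATH; 'gem environment' lists the executable directory"
fi
```

If the command is reported missing, the gem's executable directory is probably not on the PATH of the current user.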
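td-agent:
td-agent is the packaged distribution of Fluentd, and it ships td-agent-gem for managing plugin gems. If you installed Fluentd with RubyGems as above, the equivalent command is fluent-gem. For example, on a td-agent installation, the following command installs the plugin that connects to Amazon S3:
$ sudo /usr/sbin/td-agent-gem install fluent-plugin-s3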
Configure Fluentd:
After installing Fluentd, you must create a configuration file that describes the input, output, and filtering plugins you wish to use. The configuration file is written in Fluentd's own directive-based syntax, built from blocks such as <source>, <filter>, and <match>, and can be saved in the /etc/fluentd directory or elsewhere. The file can get complicated depending on the number and complexity of the plugins used.
Below are the steps:
$ sudo mkdir /etc/fluentd
$ sudo touch /etc/fluentd/fluent.conf
Set up input plugins: Fluentd input plugins are used to collect data from different sources, such as log files, system logs, or network protocols like syslog or TCP. You can configure the input plugins in your Fluentd configuration file to read data from your desired source.
Set up output plugins: Fluentd output plugins are used to forward data to various destinations, such as Elasticsearch, Hadoop, or cloud storage services like Amazon S3. You can configure the output plugins in your Fluentd configuration file to send data to your desired destination.
Set up filtering plugins: Fluentd filtering plugins are used to manipulate and transform data before it is forwarded to the output plugins. You can configure the filtering plugins in your Fluentd configuration file to apply filters to the data collected from the input plugins.
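As a rough sketch of how the three plugin types fit together, the script below writes a small source, filter, and match pipeline to a scratch file and, if fluentd is installed, asks it to check the syntax with --dry-run. The tag demo.log, the tailed path /var/log/messages, the position file under /tmp, and the added hostname field are all placeholders chosen for illustration, not values from this guide.

```shell
#!/bin/sh
# Sketch: write a minimal source -> filter -> match pipeline to a scratch file.
# The tag, paths, and field name below are illustrative placeholders.
workdir="$(mktemp -d)"
conf="$workdir/fluent.conf"

cat > "$conf" <<'EOF'
# Input: follow a log file and tag each line as demo.log
<source>
  @type tail
  path /var/log/messages
  pos_file /tmp/fluentd-demo.pos
  tag demo.log
  <parse>
    @type none
  </parse>
</source>

# Filter: add a field to every demo.log record
<filter demo.log>
  @type record_transformer
  <record>
    hostname "#{Socket.gethostname}"
  </record>
</filter>

# Output: print records to Fluentd's own stdout
<match demo.log>
  @type stdout
</match>
EOF

echo "wrote $conf"

# Validate the syntax without starting the pipeline, if fluentd is installed.
if command -v fluentd >/dev/null 2>&1; then
  fluentd --dry-run -c "$conf" || echo "dry-run reported a problem with $conf"
else
  echo "fluentd not on PATH; skipping --dry-run check"
fi
```

Writing to a scratch directory keeps the experiment away from /etc/fluentd. Once the pipeline looks right, the same blocks can be copied into the real configuration file.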
For example, to collect logs from the Apache web server and forward them to Amazon S3, you can use the following configuration:
<source>
  @type tail
  path /var/log/httpd/access_log
  pos_file /var/log/td-agent/httpd-access.pos
  tag apache.access
  format apache
</source>

<filter apache.access>
  @type parser
  key_name message
  format /^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>[^ ]*) (?<path>[^ ]*) [^"]*" (?<code>[^ ]*) (?<size>[^ ]*) "([^"]*)" "([^"]*)"$/
  time_key time
  time_format %d/%b/%Y:%H:%M:%S %z
  reserve_data true
</filter>

<match apache.access>
  @type s3
  aws_key_id YOUR_AWS_KEY_ID
  aws_sec_key YOUR_AWS_SECRET_KEY
  s3_bucket YOUR_S3_BUCKET_NAME
  s3_region ap-northeast-1
  path logs/
  # if you want to use ${tag} or %Y/%m/%d/ like syntax in path / s3_object_key_format,
  # need to specify tag for ${tag} and time for %Y/%m/%d in <buffer> argument.
  <buffer tag,time>
    @type file
    path /var/log/fluent/s3
    timekey 3600 # 1 hour partition
    timekey_wait 10m
    timekey_use_utc true # use utc
    chunk_limit_size 256m
  </buffer>
</match>
Save and close the configuration file.
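Once the file is saved, a quick syntax check before starting the service can catch mistakes early. The sketch below assumes the configuration lives at /etc/fluentd/fluent.conf as in this guide. CONF_PATH and run_state are hypothetical names used only in this snippet. A configuration that uses the S3 output will only pass the check if fluent-plugin-s3 is installed for the same Fluentd.

```shell
# Path used earlier in this guide; set CONF_PATH (a hypothetical variable) to override.
conf="${CONF_PATH:-/etc/fluentd/fluent.conf}"

if ! command -v fluentd >/dev/null 2>&1; then
  run_state="no-fluentd"
  echo "fluentd not found on PATH"
elif [ ! -s "$conf" ]; then
  run_state="no-config"
  echo "config file $conf is missing or empty"
elif fluentd --dry-run -c "$conf"; then
  # --dry-run parses the configuration without starting the pipeline.
  run_state="ok"
  echo "config parsed; start Fluentd with: fluentd -c $conf"
else
  run_state="invalid"
  echo "fluentd rejected $conf; see the messages above"
fi
```

The snippet only prints the start command rather than running it, because fluentd -c stays in the foreground until it is stopped.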
Conclusion
With Fluentd's flexibility and versatility, you can easily customize it to meet your specific data collection and forwarding needs. If you are considering an open-source tool for data collection, you should give Fluentd a try.